D₊₊: Structural Credit Assignment in Tightly Coupled Multiagent Domains
Autonomous multiagent teams can be used in complex exploration tasks to both speed up exploration and improve efficiency. However, the use of multiagent systems presents additional challenges. Specifically, in domains where the agents' actions are tightly coupled, coordinating multiple agents to achieve cooperative behavior at the group level is difficult. In this work, we demonstrate that reward shaping can greatly benefit learning in tightly coupled multiagent exploration tasks. We argue that in tightly coupled domains, effective coordination depends on rewarding stepping stone actions: actions that would improve the system's objective but are not rewarded because other agents have not yet found their proper actions. To this end, we build on current work in the multiagent structural credit assignment literature and extend the idea of counterfactuals introduced in difference evaluation functions.
Difference evaluation functions have a number of properties that make them ideal as a learning signal, such as sensitivity to an agent's actions and alignment with the global system objective. However, they fail to tackle the coordination problem in domains where the agent coupling is tight. Extending the idea of counterfactuals, we propose a novel reward structure, D₊₊. We investigate the performance of D₊₊ in two different multiagent domains. We show that while both the global team performance and the difference evaluation function fail to properly reward the stepping stone actions, our proposed algorithm successfully rewards such behaviors and provides superior performance (a 166% performance improvement and a fourfold convergence speedup) compared to policies learned using either the global reward or the difference reward.
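The stepping stone problem described above can be illustrated with a minimal sketch. The abstract does not give formulas, so everything here is an assumption made for illustration: a toy domain in which a point of interest (POI) counts only when at least `coupling` agents observe it, the standard difference reward computed by removing an agent, and a hypothetical counterfactual (`added_copies_reward`) that adds copies of an agent instead of removing it. It is not the paper's exact D₊₊ definition.

```python
from typing import List


def global_reward(positions: List[int], n_pois: int, coupling: int) -> int:
    """G(z): number of POIs observed by at least `coupling` agents.

    positions[i] is the index of the POI that agent i is observing.
    """
    counts = [0] * n_pois
    for p in positions:
        counts[p] += 1
    return sum(1 for c in counts if c >= coupling)


def difference_reward(positions: List[int], i: int, n_pois: int, coupling: int) -> int:
    """D_i = G(z) - G(z without agent i): the usual removal counterfactual."""
    without_i = positions[:i] + positions[i + 1:]
    return (global_reward(positions, n_pois, coupling)
            - global_reward(without_i, n_pois, coupling))


def added_copies_reward(positions: List[int], i: int, n_copies: int,
                        n_pois: int, coupling: int) -> float:
    """Hypothetical counterfactual: add `n_copies` copies of agent i and
    report the gain in G per added copy (an assumed form, for illustration)."""
    with_copies = positions + [positions[i]] * n_copies
    gain = (global_reward(with_copies, n_pois, coupling)
            - global_reward(positions, n_pois, coupling))
    return gain / n_copies


# Two agents, two POIs, each POI needs three simultaneous observers.
positions = [0, 1]
print(global_reward(positions, 2, 3))           # 0: no POI is fully observed
print(difference_reward(positions, 0, 2, 3))    # 0: removing agent 0 changes nothing
print(added_copies_reward(positions, 0, 2, 2, 3))  # 0.5: two more partners would complete POI 0
```

In this toy setting, agent 0 is already in a useful place, yet both the global reward and the removal-based difference reward are zero. Only the counterfactual that imagines additional partners produces a non-zero signal.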
Social Network Based Substance Abuse Prevention via Network Modification (A Preliminary Study)
Substance use and abuse is a significant public health problem in the United
States. Group-based intervention programs offer a promising means of preventing
and reducing substance abuse. Although such programs can be effective, poorly
composed intervention groups can increase deviant behaviors among
participants, a process known as deviancy training. This paper investigates the
problem of optimizing the social influence related to the deviant behavior via
careful construction of the intervention groups. We propose a Mixed Integer
Optimization formulation that decides on the intervention groups, captures the
impact of the groups on the structure of the social network, and models the
impact of these changes on behavior propagation. In addition, we propose a
scalable hybrid meta-heuristic algorithm that combines Mixed Integer
Programming and Large Neighborhood Search to find near-optimal network
partitions. Our algorithm is packaged in the form of GUIDE, an AI-based
decision aid that recommends intervention groups. Being the first quantitative
decision aid of this kind, GUIDE is able to assist practitioners, in particular
social workers, in three key areas: (a) GUIDE proposes near-optimal solutions
that are shown, via extensive simulations, to significantly improve over the
traditional qualitative practices for forming intervention groups; (b) GUIDE is
able to identify circumstances when an intervention will lead to deviancy
training, thus saving time, money, and effort; (c) GUIDE can evaluate current
strategies of group formation and discard strategies that will lead to deviancy
training. In developing GUIDE, we are primarily interested in substance use
interventions among homeless youth, a high-risk and vulnerable population.
GUIDE is developed in collaboration with Urban Peak, a homeless-youth-serving
organization in Denver, CO, and is being prepared for deployment.
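The destroy-and-repair loop behind Large Neighborhood Search can be sketched as follows. This is not GUIDE's algorithm: the abstract does not specify the objective or the Mixed Integer Programming model, so the pairwise `weights` matrix (a stand-in for unwanted influence between two people placed in the same group), the capacity constraint, and the greedy repair step (used here in place of a MIP solve) are all assumptions for illustration.

```python
import random
from typing import List, Tuple


def partition_cost(assign: List[int], weights: List[List[float]]) -> float:
    """Sum of pairwise weights over pairs placed in the same group.
    Entries of -1 in `assign` mark currently unassigned people."""
    n = len(assign)
    return sum(
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if assign[i] >= 0 and assign[i] == assign[j]
    )


def repair_greedy(assign: List[int], freed: List[int], weights: List[List[float]],
                  n_groups: int, capacity: int) -> None:
    """Reinsert each freed person into the open group that adds the least cost.
    A hybrid method would solve a small MIP here; greedy is a stand-in."""
    for v in freed:
        open_groups = [g for g in range(n_groups) if assign.count(g) < capacity]

        def added_cost(g: int) -> float:
            return sum(weights[v][u] for u in range(len(assign)) if assign[u] == g)

        assign[v] = min(open_groups, key=added_cost)


def large_neighborhood_search(weights: List[List[float]], n_groups: int, capacity: int,
                              iterations: int = 200, destroy_size: int = 3,
                              seed: int = 0) -> Tuple[List[int], float]:
    """Repeatedly unassign a random subset, repair it, and keep improvements."""
    n = len(weights)
    if n > n_groups * capacity:
        raise ValueError("not enough capacity for all people")
    rng = random.Random(seed)
    best = [i % n_groups for i in range(n)]  # round-robin starting partition
    best_cost = partition_cost(best, weights)
    for _ in range(iterations):
        candidate = best[:]
        freed = rng.sample(range(n), min(destroy_size, n))
        for v in freed:
            candidate[v] = -1
        repair_greedy(candidate, freed, weights, n_groups, capacity)
        cost = partition_cost(candidate, weights)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost
```

A call such as `large_neighborhood_search(weights, n_groups=2, capacity=3)` on a symmetric 6-by-6 matrix returns a partition whose cost is never worse than the round-robin starting point, because a candidate is accepted only when it strictly improves the cost.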
Fair Influence Maximization: A Welfare Optimization Approach
Several behavioral, social, and public health interventions, such as
suicide/HIV prevention or community preparedness against natural disasters,
leverage social network information to maximize outreach. Algorithmic influence
maximization techniques have been proposed to aid with the choice of "peer
leaders" or "influencers" in such interventions. Yet, traditional algorithms
for influence maximization have not been designed with these interventions in
mind. As a result, they may disproportionately exclude minority communities
from the benefits of the intervention. This has motivated research on fair
influence maximization. Existing techniques come with two major drawbacks.
First, they require committing to a single fairness measure. Second, these
measures are typically imposed as strict constraints, leading to undesirable
properties such as wasted resources.
To address these shortcomings, we provide a principled characterization of
the properties that a fair influence maximization algorithm should satisfy. In
particular, we propose a framework based on social welfare theory, wherein the
cardinal utilities derived by each community are aggregated using the
isoelastic social welfare functions. Under this framework, the trade-off
between fairness and efficiency can be controlled by a single inequality
aversion design parameter. We then show under what circumstances our proposed
principles can be satisfied by a welfare function. The resulting optimization
problem is monotone and submodular and can be solved efficiently with
optimality guarantees. Our framework encompasses as special cases leximin and
proportional fairness. Extensive experiments on synthetic and real-world
datasets, including a case study on landslide risk management, demonstrate the
efficacy of the proposed framework.
Comment: The short version of this paper appears in the proceedings of AAAI-2
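To make the role of the inequality aversion parameter concrete, here is a minimal sketch of the textbook isoelastic welfare function applied to per-community utilities. The function name, the choice to leave communities unweighted, and the example utility values are assumptions for illustration. The abstract does not state how utilities are scaled or weighted, so this is not necessarily the paper's exact formulation.

```python
import math
from typing import Sequence


def isoelastic_welfare(utilities: Sequence[float], alpha: float) -> float:
    """Textbook isoelastic welfare over strictly positive utilities.

    alpha = 1  : plain sum of utilities (no inequality aversion)
    alpha = 0  : sum of logs (proportional fairness)
    alpha -> -infinity : increasingly dominated by the worst-off community
    """
    if alpha > 1:
        raise ValueError("alpha must be at most 1")
    if any(u <= 0 for u in utilities):
        raise ValueError("utilities must be strictly positive")
    if alpha == 0:
        return sum(math.log(u) for u in utilities)
    return sum(u ** alpha / alpha for u in utilities)


# Hypothetical fractions of two communities reached by two candidate seed sets.
equal = [0.4, 0.4]     # lower total reach, evenly spread
unequal = [0.8, 0.1]   # higher total reach, one community mostly left out

for alpha in (1.0, 0.0, -2.0):
    prefers_equal = isoelastic_welfare(equal, alpha) > isoelastic_welfare(unequal, alpha)
    print(alpha, "prefers equal" if prefers_equal else "prefers unequal")
```

With these example values, `alpha = 1` favors the allocation with the larger total, while `alpha = 0` and `alpha = -2` favor the even allocation, which is the fairness and efficiency trade-off the single parameter is meant to control.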